Problems Creating Task-relevant Clone Detection Reference Data

Authors

  • Andrew Walenstein
  • Nitin Jyoti
  • Junwei Li
  • Yun Yang
  • Arun Lakhotia
Abstract

One prevalent method for evaluating the results of automated software analysis tools is to compare the tools’ output to the judgment of human experts. This evaluation strategy is commonly assumed in the field of software clone detector research. We report our experiences from a study using several human judges who tried to establish “reference sets” of function clones for several medium-sized software systems written in C. The study employed multiple judges and followed a process typical for inter-coder reliability assurance wherein coders discussed classification discrepancies until consensus was reached. A high level of disagreement was found for reference sets made specifically for reengineering task contexts. The results, although preliminary, raise questions about limitations of prior clone detector evaluations and other similar tool evaluations. Implications are drawn for future work on reference data generation, tool evaluations, and benchmarking efforts.

Similar resources

Electrophysiological Correlates of Change Detection during Delayed Matching Task: A Comparison of Different References

Detecting the changed information between memory representation and incoming sensory inputs is a fundamental cognitive ability. By offering the promise of excellent temporal resolution, event-related potential (ERP) technique has served as a primary tool for studying this process with reference of the linked mastoid (LM). However, given that LM may distort the ERP signals, it is still undetermi...

How to Make the Hidden Visible – Code Clone Presentation Revisited

Nowadays, a slew of clone detection approaches exists, producing a lot of clone data. This data needs to be analyzed, either manually or automatically. However, even after analysis, it is still not trivial to derive conclusions or actions from the analyzed data. In particular, we argue that it is often unclear how the cloning information should be presented to the user. We present our idea of t...

CATO: The Clone Alignment Tool

High-throughput cloning efforts produce large numbers of sequences that need to be aligned, edited, compared with reference sequences, and organized as files and selected clones. Different pieces of software are typically required to perform each of these tasks. We have designed a single piece of software, CATO, the Clone Alignment Tool, that allows a user to align, evaluate, edit, and select c...

Advancing Ethical Culture through Transformational Leadership for improved Public Service Delivery: Ugandan perspective

Fundamentally, public services must be of high quality so as to satisfy the wants and needs of the beneficiaries. But it’s worrisome to discover that in Uganda, public services provided are of poor quality due to unethical behaviors and wanting leadership. The purpose of this study is to show that transformational leadership can advance ethical culture to spur provision of quality services. A r...

A needle in the stack: efficient clone detection for huge collections of source code

One of the important uses of source code clone detection analysis is plagiarism detection, where a file is compared against a known corpus of source code to try to find potential matches. As the availability of Free and Open Source Software (FOSS) continues to increase it has become important to know if specific source code has been created from copies of FOSS software. Version 5.0.2 of Debian ...

Publication date: 2003